Systems and Methods for Resynchronizing Information

ABSTRACT

Methods and systems for synchronizing data files in a storage network between a first and a second storage device are provided. The method includes storing first data files associated with the first storage device to a storage medium, whereby the first data files include first data records. The storage medium may then be transferred to the second storage device. The first data files from the storage medium may be loaded onto the second storage device. The second data records from the first storage device may be received, and the first and second data records are compared. The first data files at the second storage device may be updated based on the comparison of the first and second data records.

RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 11/640,024, filed on Dec. 15, 2006, which claims the benefit under 35 U.S.C. § 120 from Provisional Application No. 60/752,201, filed Dec. 19, 2005, and is incorporated herein by reference.

This application is related to the following patents and pending applications, each of which is hereby incorporated herein by reference in its entirety:

-   Application titled “Systems and Methods for Classifying and Transferring Information in a Storage Network” filed Dec. 19, 2005, attorney docket number 4982/75;
-   Application Ser. No. 60/752,198 titled “Systems and Methods for Granular Resource Management in a Storage Network” filed Dec. 19, 2005, attorney docket number 4982/84;
-   Application Serial No. not known, titled “Systems and Methods for Performing Multi-Path Storage Operations” filed Dec. 19, 2005, attorney docket number 4982/88;
-   Application Ser. No. 60/752,196 titled “System and Method for Migrating Components in a Hierarchical Storage Network” filed Dec. 19, 2005, attorney docket number 4982/95;
-   Application Ser. No. 60/752,202 titled “Systems and Methods for Unified Reconstruction of Data in a Storage Network” filed Dec. 19, 2005, attorney docket number 4982/97;
-   Application Ser. No. 60/752,197 titled “Systems and Methods for Hierarchical Client Group Management” filed Dec. 19, 2005, attorney docket number 4982/102.

BACKGROUND OF THE INVENTION

The invention disclosed herein relates generally to performing data transfer operations in a data storage system. More particularly, the present invention relates to facilitating data synchronization between a source and destination device in a storage operation system.

Performing data synchronization is an important task in any system that processes and manages data. Synchronization is particularly important when a data volume residing in one location in a system is to be replicated and maintained on another part of the system. Replicated data volumes may be used, for example, for backup repositories, data stores, or in synchronous networks which may utilize multiple workstations requiring identical data storage.

File replication may include continually capturing write activity on a source computer and transmitting this write activity from the source computer to a destination or target computer in real-time or near real-time. A first step in existing file replication systems, as illustrated in FIG. 1A, is a synchronization process to ensure that the source data 22 at a source storage device and the destination data 24 at a destination storage device are the same. That is, before a destination computer 28 may begin storing write activity associated with the source data 22 at a source computer 26, the system 20 needs to first ensure that the previously written source data 22 is stored at the destination computer 28.

Problems in existing synchronization processes may occur as a result of low or insufficient bandwidth in a network connection 30 over which the source and destination computers 26, 28 communicate. Insufficient bandwidth over the connection 30 ultimately causes bottlenecks and network congestion. For example, if the rate of change of data at the source computer 26 is greater than the bandwidth available on the network connection 30, data replication may not occur since data at the source computer 26 will continue to change at a faster rate than it can be updated at the destination computer 28. Therefore, the attempts to synchronize the source and destination computers 26, 28 may continue indefinitely without success and one set of data will always lag behind the other.
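The bandwidth argument above can be illustrated with a small arithmetic sketch. The function name, the units (megabits and megabits per second), and the constant-rate assumption are illustrative only and are not part of the described system.

```python
def backlog_after(seconds, change_rate_mbps, bandwidth_mbps, initial_backlog_mb=0.0):
    """Return the unreplicated backlog, in megabits, after `seconds`.

    Assumes, for illustration only, that data changes at a constant rate
    at the source and that the link carries a constant bandwidth.
    """
    net_growth_per_second = change_rate_mbps - bandwidth_mbps
    # The backlog cannot drop below zero once the destination has caught up.
    return max(0.0, initial_backlog_mb + net_growth_per_second * seconds)
```

Under these assumptions, whenever the change rate exceeds the bandwidth the backlog grows without bound, which corresponds to the case in which synchronization never completes.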

Additional synchronization problems may arise due to hardware failure. If either the source computer 26 or the destination computer 28 were to fail, become unavailable, or have a failure of one of its storage components, application data may still be generated without system 20 being able to replicate the data to the other storage device. Neither computer 26 nor computer 28 possesses a means of tracking data changes during such a failure. Other possible sources of disruption of replication operations in existing systems may include disrupted storage paths, broken communication links or exceeding the storage capacity of a storage device.

Additionally, some existing synchronization systems maintain continuity across multiple storage volumes using a wholesale copy routine. Such a routine entails periodically copying most or all of the contents of a storage volume across the network to replace all the previous replication data. A storage policy or network administrator may control the operations and determine the frequency of the storage operation. Copying the entire contents of a storage volume across a network to a replication storage volume may be inefficient and can overload the network between the source computer 26 and the destination computer 28. Copying the entire volume across the network connection 30 between the two computers causes the connection 30 to become congested and unavailable for other operations or to other resources, which may lead to hardware or software operation failure, over-utilization of storage and network resources and lost information. A replication operation as described above may also lack the capability to encrypt or secure data transmitted across the network connection 30. A replication operation that takes place over a public network, such as the Internet, or publicly accessible wide area network (“WAN”), can subject the data to corruption or theft.

SUMMARY OF THE INVENTION

In accordance with some aspects of the present invention, a method of synchronizing data files with a storage operation between a first and a second storage device is provided. The method may include storing first data files associated with the first storage device to a storage medium, whereby the first data files include first data records. The storage medium may then be transferred to the second storage device. The first data files from the storage medium may be stored on the second storage device. The second data records from the first storage device may be received, and the first and second data records may be compared. The first data files at the second storage device may be updated based on the comparison of the first and second data records.
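As a minimal sketch of the method just summarized, the following assumes that each data file carries a single comparable record identifier and that files are keyed by path. The function name and the dictionary layout are hypothetical and are not taken from the disclosure.

```python
def resync_from_medium(medium_files, current_source_files):
    """Seed the destination from a transferred medium, then update it.

    Both arguments map a file path to a (record_id, data) tuple; this
    layout is an assumption made for illustration.
    """
    # Load the first data files from the storage medium onto the destination.
    destination = dict(medium_files)
    updated = []
    # Compare the seeded (first) records with the records received from the source.
    for path, (record_id, data) in current_source_files.items():
        seeded = destination.get(path)
        if seeded is None or seeded[0] != record_id:
            # Only files whose records differ are sent across the network.
            destination[path] = (record_id, data)
            updated.append(path)
    return destination, sorted(updated)
```

In this sketch only the files that changed after the medium was written travel over the network; the bulk of the data arrives on the transferred medium.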

In accordance with other embodiments of the present invention, a method of synchronizing data after an interruption of data transfer between a first and a second storage device is provided. The method may include detecting an interruption in the data transfer between the first and the second storage device, and comparing first logged data records in a first data log associated with the first storage device with second logged records in a second data log associated with the second storage device. Updated data files from the first storage device may then be sent to the second storage device based on a comparison of the first and the second logged records.

One embodiment of the present invention includes a method of synchronizing data between a first and second storage device. The method may include identifying a first set of data on a first storage device for replication and capturing the set of data in a first log entry. Changes to the first set of data may be determined and recorded as a second set of data in a suitable log or data structure for recording such data. Next, the first and second set of data may be transmitted to the second storage device and any changes replicated in the second storage device.

Another embodiment of the present invention includes a method of synchronizing data after an interruption of data transfer between a first and a second storage device. When an interruption in the data transfer between the first and the second storage device is detected, the first logged data records in a first data log associated with the first storage device are compared with second logged records in a second data log associated with the second storage device. Updated data files from the first storage device are then sent to the second storage device based on comparing the first and the second logged records.
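The log comparison performed after an interruption might be sketched as follows. The sketch assumes that each logged record is a (sequence number, payload) pair and that the sequence number serves as the record identifier; both are illustrative assumptions, as is the function name.

```python
def entries_to_resend(source_log, destination_log):
    """Return source log entries that are missing from the destination log.

    Each log is a list of (sequence_number, payload) tuples. Using the
    sequence number as the record identifier is an assumption.
    """
    applied = {sequence for sequence, _ in destination_log}
    # Preserve source order so updates can be replayed as they occurred.
    return [(sequence, payload)
            for sequence, payload in source_log
            if sequence not in applied]
```

For example, if the source log holds records 1 through 3 and the destination log holds only record 1, records 2 and 3 would be selected for transmission once the connection is restored.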

In yet another embodiment, a method of replicating data on an electronic storage system network is presented. A set of data, including a record identifier, is stored on a first storage device and copied to an intermediary storage device. The set of data from the intermediary storage device may then be transferred to a third storage device. The record identifier of the set of data on the third storage device may then be compared to the record identifier of the set of data on the first storage device. The set of data on the third storage device is updated upon detection of non-identical record identifiers, wherein the updated data files are transmitted across the storage network.

In another embodiment, a system for replicating data on an electronic storage network is presented. The system includes a first and second storage device, a first log for tracking changes to data stored on the first storage device, and a replication manager module. The replication manager module transmits updated data from the first log to the second storage device.

In another embodiment, a computer-readable medium having stored thereon a plurality of sequences of instructions is presented. When executed by one or more processors the sequences cause an electronic device to store changes to data on a first storage device in a first log including record identifiers. Updated data is transmitted from the first log to a second log on a second storage device where the record identifier of the data from the first log is compared to the record identifier of the data from the second log. The second storage device is updated with the updated data upon detecting a difference in the record identifiers.

In another embodiment, a computer-readable medium having stored thereon a plurality of sequences of instructions is presented. When executed by one or more processors the sequences cause an electronic device to detect a failure event in a data replication operation between first and second storage devices. Updates of a first set of data are stored in the first storage device. A second set of data detailing the updates to the first set of data is logged. The second set of data also includes a record identifier which is compared to a record identifier of the second storage device. The updates to the first set of data, identified by the second set of data, are replicated on the second storage device.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is illustrated in the figures of the accompanying drawings which are meant to be exemplary and not limiting, in which like references are intended to refer to like or corresponding parts, and in which:

FIG. 1 is a block diagram of a prior art system;

FIG. 2 is a block diagram of a system for performing storage operations on electronic data in a computer network according to an embodiment of the invention;

FIG. 3A is a block diagram of storage operation system components utilized during synchronization operations according to an embodiment of the invention;

FIG. 3B is an exemplary data format associated with logged data entries according to an embodiment of the invention;

FIG. 4A is a block diagram of storage operation system components utilized during synchronization operations in accordance with another embodiment of the invention;

FIG. 4B is an exemplary data format associated with logged data record entries according to an embodiment of the invention;

FIG. 5 is a flowchart illustrating some of the steps involved in replication according to an embodiment of the invention;

FIG. 6 is a flowchart illustrating some of the steps involved in replication according to an embodiment of the invention; and

FIG. 7 is a flowchart illustrating some of the steps involved in replication according to another embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

Detailed embodiments of the present invention are disclosed herein; however, it is to be understood that the disclosed embodiments are merely exemplary of the invention, which may be embodied in various forms. Therefore, specific functional details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for teaching one skilled in the art to variously employ the present invention in any appropriately detailed embodiment.

With reference to FIGS. 2-7, exemplary aspects of embodiments and features of the present invention are presented. Turning now to FIG. 2, a block diagram of a storage operation cell 50 that may perform storage operations on electronic data in a computer network in accordance with an embodiment of the present invention is illustrated. As shown, storage operation cell 50 may generally include a storage manager 100, a data agent 95, a media agent 105, a storage device 115, and may include certain other components such as a client computer 85, a data or information store 90, databases 110, 111, a jobs agent 120, an interface module 125, a management agent 130, and a resynchronization agent 133. Such system and elements thereof are exemplary of a modular storage management system such as the CommVault QiNetix™ system, and also the CommVault GALAXY™ backup system, available from CommVault Systems, Inc. of Oceanport, N.J., and further described in U.S. Pat. No. 7,035,880, which is incorporated herein by reference in its entirety.

A storage operation cell, such as cell 50, may generally include combinations of hardware and software components associated with performing storage operations on electronic data. Exemplary storage operation cells according to embodiments of the invention may include, as further described herein, CommCells as embodied in the QNet storage management system and the QiNetix storage management system by CommVault Systems of Oceanport, New Jersey. According to some embodiments of the invention, storage operation cell 50 may be related to backup cells and provide some or all of the functionality of backup cells as described in application Ser. No. 10/877,831 which is hereby incorporated by reference in its entirety.

Storage operations performed by storage operation cell 50 may include creating, storing, retrieving, and migrating primary data copies and secondary data copies (which may include, for example, snapshot copies, backup copies, HSM (Hierarchical Storage Management) copies, archive copies, and other types of copies of electronic data). Storage operation cell 50 may also provide one or more integrated management consoles for users or system processes to interface with in order to perform certain storage operations on electronic data as further described herein. Such integrated management consoles may be displayed at a central control facility or several similar consoles distributed throughout multiple network locations to provide global or geographically specific network data storage information. The use of integrated management consoles may provide a unified view of the data operations across the network.

A unified view of the data operations collected across the entire storage network may provide an advantageous benefit in the management of the network. The unified view may present the system, or a system administrator, with a broad view of the utilized resources of the network. Presenting such data to one centralized management console may allow for a more complete and efficient administration of the available resources of the network. The storage manager 100, either via a preconfigured policy or via a manual operation from a system administrator, can reallocate resources to more efficiently run the network. Data paths from storage operation cells may be re-routed to avoid areas of the network which are congested by taking advantage of underutilized data paths or operation cells. Additionally, should a storage operation cell arrive at or exceed a database size maximum, storage device capacity maximum or fail outright, several routes of redundancy may be triggered to ensure the data arrives at the location for which it was intended. A unified view may provide the manager with a collective status of the entire network allowing the system to adapt and reallocate the many resources of the network for faster and more efficient utilization of those resources.
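One way the re-routing decision described above could be expressed is sketched below. The utilization figures, the 0.8 congestion threshold, and the function name are hypothetical; the disclosure does not specify how congestion is measured.

```python
def choose_data_path(utilization, congestion_threshold=0.8):
    """Pick the least-utilized data path that is not congested.

    `utilization` maps a path name to a fraction between 0.0 and 1.0.
    The threshold value is an assumption made for illustration.
    Returns None when every path is at or above the threshold.
    """
    candidates = {path: load for path, load in utilization.items()
                  if load < congestion_threshold}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)
```

A storage manager following a preconfigured policy could apply a selection of this kind to the collective status reported through the unified view.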

In some embodiments, storage operations may be performed according to a storage policy. A storage policy generally may be a data structure or other information source that includes a set of preferences and other storage criteria for performing a storage operation and/or other functions that relate to storage operation. The preferences and storage criteria may include, but are not limited to, a storage location, relationships between system components, network pathway to utilize, retention policies, data characteristics, compression or encryption requirements, preferred system components to utilize in a storage operation, and other criteria relating to a storage operation. For example, a storage policy may indicate that certain data is to be stored in a specific storage device, retained for a specified period of time before being aged to another tier of secondary storage, copied to secondary storage using a specified number of streams, etc. In one embodiment, a storage policy may be stored in a storage manager database 111. Alternatively, certain data may be stored to archive media as metadata for use in restore operations or other storage operations. In other embodiments, the data may be stored to other locations or components of the system.
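A storage policy of the kind described above could be represented as a simple data structure. The following sketch is illustrative only: the class name, field names, and the aging check are assumptions, not the format used by any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical container for storage preferences and criteria."""
    storage_location: str          # e.g. a specific storage device
    retention_days: int            # period before aging to another tier
    streams: int = 1               # number of streams for secondary copies
    compress: bool = False
    encrypt: bool = False
    preferred_components: tuple = ()


def should_age(policy, age_days):
    """Return True once data has been retained longer than the policy allows."""
    return age_days > policy.retention_days
```

For instance, a policy created as `StoragePolicy("tape_library_1", 30, streams=4)` would indicate that data older than 30 days is eligible to be aged to another tier of secondary storage.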

A schedule policy specifies when and how often to perform storage operations and may also specify performing certain storage operations (e.g., replicating certain data) on sub-clients of data including how to handle those sub-clients. Sub-clients may represent static or dynamic associations of portions of data of a volume and are generally mutually exclusive. Thus, a portion of data may be given a label and the association is stored as a static entity in an index, database or other storage location used by the system. Sub-clients may also be used as an effective administrative scheme of organizing data according to data type, department within the enterprise, storage preferences, etc. For example, an administrator may find it preferable to separate e-mail data from financial data using two different sub-clients having different storage preferences, retention criteria, etc.
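The mutually exclusive nature of sub-clients can be sketched as a labeling step that rejects any portion of data matching more than one sub-client. The rule predicates, the sub-client names, and the "default" catch-all are hypothetical.

```python
def assign_subclients(paths, rules, default="default"):
    """Label each path with exactly one sub-client.

    `rules` maps a sub-client name to a predicate taking a path. A path
    matching several rules raises ValueError, reflecting the assumption
    that sub-clients are mutually exclusive.
    """
    assignment = {name: [] for name in rules}
    assignment[default] = []
    for path in paths:
        matches = [name for name, matches_rule in rules.items()
                   if matches_rule(path)]
        if len(matches) > 1:
            raise ValueError(f"{path!r} matches several sub-clients: {matches}")
        assignment[matches[0] if matches else default].append(path)
    return assignment
```

An administrator separating e-mail data from financial data might, under these assumptions, define one rule per sub-client and attach different storage preferences to each resulting label.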

Storage operation cells may contain not only physical devices, but also may represent logical concepts, organizations, and hierarchies. For example, a first storage operation cell 50 may be configured to perform HSM operations, such as data backup or other types of data migration, and may include a variety of physical components including a storage manager 100 (or management agent 130), a media agent 105, a client component 85, and other components as described herein. A second storage operation cell may contain the same or similar physical components; however, it may be configured to perform storage resource management (“SRM”) operations, such as monitoring a primary data copy or performing other known SRM operations.

In one embodiment a data agent 95 may be a software module or part of a software module that is generally responsible for archiving, migrating, and recovering data from client computer 85 stored in an information store 90 or other memory location. Each computer 85 may have at least one data agent 95 and a resynchronization agent 133. Storage operation cell 50 may also support computers 85 having multiple clients (e.g., each computer may have multiple applications, with each application considered as either a client or sub-client).

In some embodiments, the data agents 95 may be distributed between computer 85 and the storage manager 100 (and any other intermediate components (not explicitly shown)) or may be deployed from a remote location or its functions approximated by a remote process that performs some or all of the functions of the data agent 95. The data agent 95 may also generate metadata associated with the data that it is generally responsible for replicating, archiving, migrating, and recovering from client computer 85. This metadata may be appended or embedded within the client data as it is transferred to a backup or secondary storage location, such as a replication storage device, under the direction of storage manager 100.

One embodiment may also include multiple data agents 95, each of which may be used to back up, migrate, and recover data associated with a different application. For example, different individual data agents 95 may be designed to handle MICROSOFT EXCHANGE® data, MICROSOFT SHAREPOINT data or other collaborative project and document management data, LOTUS NOTES® data, MICROSOFT WINDOWS 2000® file system data, MICROSOFT Active Directory Objects data, and other types of data known in the art. Alternatively, one or more generic data agents 95 may be used to handle and process multiple data types rather than using the specialized data agents described above.

In an embodiment utilizing a computer 85 having two or more types of data, one data agent 95 may be used for each data type to archive, migrate, and restore the client computer 85 data. For example, to back up, migrate, and restore all of the data on a MICROSOFT EXCHANGE 2000® server, the computer 85 may use one MICROSOFT EXCHANGE 2000® Mailbox data agent to back up the EXCHANGE 2000® mailboxes, one MICROSOFT EXCHANGE 2000® Database data agent to back up the EXCHANGE 2000® databases, one MICROSOFT EXCHANGE 2000® Public Folder data agent to back up the EXCHANGE 2000® Public Folders, and one MICROSOFT WINDOWS 2000® File System data agent to back up the file system of the computer 85. These data agents 95 would be treated as four separate data agents 95 by the system even though they reside on the same computer 85.

In an alternative embodiment, one or more generic data agents 95 may be used, each of which may be capable of handling two or more data types. For example, one generic data agent 95 may be used to back up, migrate and restore MICROSOFT EXCHANGE 2000® Mailbox data and MICROSOFT EXCHANGE 2000® Database data while another generic data agent may handle MICROSOFT EXCHANGE 2000® Public Folder data and MICROSOFT WINDOWS 2000® File System data.
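The relationship between data agents and the data types they handle, whether specialized or generic, could be sketched as a lookup. The class, the agent names, and the data type labels below are hypothetical.

```python
class DataAgent:
    """Hypothetical data agent that handles one or more data types."""

    def __init__(self, name, data_types):
        self.name = name
        self.data_types = frozenset(data_types)


def select_agent(agents, data_type):
    """Return the first agent able to handle `data_type`."""
    for agent in agents:
        if data_type in agent.data_types:
            return agent
    raise LookupError(f"no data agent handles {data_type!r}")
```

A specialized agent would be constructed with a single data type, while a generic agent would be constructed with two or more; the same lookup serves both arrangements.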

While the illustrative embodiments described herein detail data agents implemented, specifically or generically, for Microsoft applications, one skilled in the art should recognize that other application types (e.g., Oracle data, SQL data, Lotus Notes, etc.) may be implemented without deviating from the scope of the present invention.

Resynchronization agent 133 may initiate and manage system backups, migrations, and data recovery. Although resynchronization agent 133 is shown as being part of each client computer 85, it may exist within the storage operation cell 50 as a separate module or may be integrated with or part of a data agent (not shown). In other embodiments, resynchronization agent 133 may be resident on a separate host. As a separate module, resynchronization agent 133 may communicate with all or some of the software modules in storage operation cell 50. For example, resynchronization agent 133 may communicate with storage manager 100, other data agents 95, media agents 105, and/or storage devices 115.

In one embodiment, the storage manager 100 may include a software module (not shown) or other application that may coordinate and control storage operations performed by storage operation cell 50. The storage manager 100 may communicate with the elements of storage operation cell 50 including computers 85, data agents 95, media agents 105, and storage devices 115.

In one embodiment the storage manager 100 may include a jobs agent 120 that monitors the status of some or all storage operations previously performed, currently being performed, or scheduled to be performed by the storage operation cell 50. The jobs agent 120 may be linked with an interface module 125 (typically a software module or application). The interface module 125 may include information processing and display software, such as a graphical user interface (“GUI”), an application program interface (“API”), or other interactive interface through which users and system processes can retrieve information about the status of storage operations. Through the interface module 125, users may optionally issue instructions to various storage operation cells 50 regarding performance of the storage operations as described and contemplated by embodiments of the present invention. For example, a user may modify a schedule concerning the number of pending snapshot copies or other types of copies scheduled as needed to suit particular needs or requirements. As another example, a user may utilize the GUI to view the status of pending storage operations in some or all of the storage operation cells in a given network or to monitor the status of certain components in a particular storage operation cell (e.g., the amount of storage capacity left in a particular storage device). As a further example, the interface module 125 may display the cost metrics associated with a particular type of data storage and may allow a user to determine the overall and target cost metrics associated with a particular data type. This determination may also be done for specific storage operation cells 50 or any other storage operation as predefined or user-defined (discussed in more detail below).

One embodiment of the storage manager 100 may also include a management agent 130 that is typically implemented as a software module or application program. The management agent 130 may provide an interface that allows various management components in other storage operation cells 50 to communicate with one another. For example, one embodiment of a network configuration may include multiple cells adjacent to one another or otherwise logically related in a WAN or LAN configuration (not explicitly shown). With this arrangement, each cell 50 may be connected to the other through each respective management agent 130. This allows each cell 50 to send and receive certain pertinent information from other cells 50 including status information, routing information, information regarding capacity and utilization, etc. These communication paths may also be used to convey information and instructions regarding storage operations.

In an illustrative embodiment, the management agent 130 in the first storage operation cell 50 may communicate with a management agent 130 in a second storage operation cell regarding the status of storage operations in the second storage operation cell. Another illustrative example may include a first management agent 130 in a first storage operation cell 50 that may communicate with a second management agent in a second storage operation cell to control the storage manager (and other components) of the second storage operation cell via the first management agent 130 contained in the storage manager 100 of the first storage operation cell.

Another illustrative example may include the management agent 130 in the first storage operation cell 50 communicating directly with and controlling the components in the second storage management cell 50, bypassing the storage manager 100 in the second storage management cell. In an alternative embodiment, the storage operation cells may also be organized hierarchically such that hierarchically superior cells control or pass information to hierarchically subordinate cells or vice versa.

The storage manager 100 may also maintain, in an embodiment, an index cache, a database, or other data structure 111. The data stored in the database 111 may be used to indicate logical associations between components of the system, user preferences, management tasks, Storage Resource Management (SRM) data, Hierarchical Storage Management (HSM) data or other useful data. The SRM data may, for example, include information that relates to monitoring the health and status of the primary copies of data (e.g., live or production line copies). HSM data may, for example, be related to information associated with migrating and storing secondary data copies including archival volumes to various storage devices in the storage system. As further described herein, some of this information may be stored in a media agent database 110 or other local data store. For example, the storage manager 100 may use data from the database 111 to track logical associations between the media agents 105 and the storage devices 115.

From the client computer 85, resynchronization agent 133 may maintain and manage the synchronization of data both within the storage operation cell 50, and between the storage operation cell 50 and other storage operation cells. For example, resynchronization agent 133 may initiate and manage a data synchronization operation between data store 90 and one or more of storage devices 115. Resynchronization agent 133 may also initiate and manage a storage operation between two data stores 90 and associated storage devices, each in a separate storage operation cell implemented as primary storage. Alternatively, resynchronization agent 133 may be implemented as a separate software module that communicates with the client 85 for maintaining and managing resynchronization operations.

In one embodiment, a media agent 105 may be implemented as a software module that conveys data, as directed by the storage manager 100, between computer 85 and one or more storage devices 115 such as a tape library, a magnetic media storage device, an optical media storage device, or any other suitable storage device. Media agents 105 may be linked with and control a storage device 115 associated with a particular media agent. In some embodiments, a media agent 105 may be considered to be associated with a particular storage device 115 if that media agent 105 is capable of routing and storing data to that particular storage device 115.

In operation, a media agent 105 associated with a particular storage device 115 may instruct the storage device to use a robotic arm or other retrieval means to load or eject a certain storage media, and to subsequently archive, migrate, or restore data to or from that media. The media agents 105 may communicate with the storage device 115 via a suitable communications path such as a SCSI (Small Computer System Interface), fiber channel or wireless communications link or other network connections known in the art such as a WAN or LAN. Storage device 115 may be linked to a media agent 105 via a Storage Area Network (“SAN”).

Each media agent 105 may maintain an index cache, a database, or other data structure 110 which may store index data generated during backup, migration, and restore and other storage operations as described herein. For example, performing storage operations on MICROSOFT EXCHANGE® data may generate index data. Such index data provides the media agent 105 or other external device with a fast and efficient mechanism for locating the data stored or backed up. In some embodiments, storage manager database 111 may store data associating a computer 85 with a particular media agent 105 or storage device 115 as specified in a storage policy. The media agent database 110 may indicate where, specifically, the computer data is stored in the storage device 115, what specific files were stored, and other information associated with storage of the computer data. In some embodiments, such index data may be stored along with the data backed up in the storage device 115, with an additional copy of the index data written to the index cache 110. The data in the database 110 is thus readily available for use in storage operations and other activities without having to be first retrieved from the storage device 115.
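An index cache of the kind maintained by a media agent might be sketched as an in-memory mapping from a stored file to its location. The class name, the key layout, and the media identifier and offset fields are assumptions made for illustration.

```python
class IndexCache:
    """Hypothetical local index mapping stored files to their location."""

    def __init__(self):
        self._entries = {}

    def record(self, computer, path, device, media_id, offset):
        # Written during a storage operation, alongside the stored data.
        self._entries[(computer, path)] = {
            "device": device,
            "media_id": media_id,
            "offset": offset,
        }

    def locate(self, computer, path):
        # Answered from the cache, without first reading the storage device.
        return self._entries.get((computer, path))
```

Because lookups are served from the local cache, a restore operation in this sketch can find where a file resides without retrieving the index from the storage device itself.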

In some embodiments, certain components may reside and execute on the same computer. For example, a client computer 85 including a data agent 95, a media agent 105, or a storage manager 100 coordinates and directs local archiving, migration, and retrieval application functions as further described in U.S. Pat. No. 7,035,880. Thus, client computer 85 can function independently or together with other similar client computers 85.

FIG. 3A illustrates a block diagram of a system 200 of storage operation system components that may be utilized during synchronization operations on electronic data in a computer network in accordance with an embodiment of the present invention. The system 200 may comprise CLIENT 1 and CLIENT 2 for, among other things, replicating data. CLIENT 1 may include a replication manager 210, a memory device 215, a log filter driver 220, a log 225, a file system 230, and a link to a storage device 235. Similarly, CLIENT 2 may include a replication manager 245, a memory device 250, a log filter driver 255, a log 260, a file system 265, and a storage device. Additional logs 261 may also reside on CLIENT 2 in some embodiments.

In one embodiment, replication manager 210 may be included in resynchronization agent 133 (FIG. 2). Replication manager 210, in one embodiment, may manage and coordinate the replication and transfer of data files between storage device 235 and a replication volume. As previously described in relation to FIG. 2, resynchronization agent 133 may be included in client computer 85. In such an embodiment, replication manager 210 may reside within resynchronization agent 133 (FIG. 2) in a client computer. In other embodiments, the replication manager 210 may be part of a computer operating system (OS). In such embodiments, for example, client computer 85 (FIG. 2) may communicate and coordinate the data replication processes with the OS.

In the exemplary embodiment of FIG. 3A, the replication process between CLIENT 1 and CLIENT 2 in system architecture 200 may occur, for example, during a data write operation in which storage data may be transferred from a memory device 215 to a log filter driver 220. Log filter driver 220 may, among other things, filter or select specific application data or other data that may be parsed as part of the replication process that is received from the memory device 215. For example, ORACLE data, SQL data, or MICROSOFT EXCHANGE data may be selected by the log filter driver 220. The log filter driver 220 may, among other things, include a specific application or module that resides on the input/output (“I/O”) stack between the memory device 215 and the storage device 235. Once write data passes through the memory device 215 towards the file system 230, the write data is intercepted and processed by the log filter driver 220. As the write data is intercepted by the log filter driver 220, it is also received by the file system 230. The file system 230 may be responsible for managing the allocation of storage space on the storage device 235. Therefore, the file system 230 may facilitate storing the write data to the storage device 235 associated with CLIENT 1.

In order to replicate the filtered write data that is received from the memory device 215, the log filter driver 220 may send filtered write data to the log 225. The log 225 may include metadata in addition to write data, whereby the write data entries in log 225 may include a data format 300, such as that illustrated in FIG. 3B. Metadata may include information, or data, about the data stored on the system. Metadata, while generally not including the substantive operational data of the network, is useful in the administration, security, maintenance and accessibility of operational data. Examples of metadata include file size, edit times, edit dates, locations on storage devices, version numbers, encryption codes, restrictions on access or uses, and tags of information that may include an identifier for editors. These are mere examples of common usages of metadata. Any form of data that describes or contains attributes or parameters of other data may be considered metadata.

As illustrated in FIG. 3B, the data format of the logged write data entries in the log 225 may include, for example, a file identifier field(s) 302, an offset 304, a payload region 306, and a timestamp 309. Identifier 302 may include information associated with the write data (e.g., file name, path, size, computer device associations, user information, etc.). Timestamp field 309 may include a timestamp referring to the time associated with its log entry, and in some embodiments may include an indicator, which may be unique, such as a USN.

Offset 304 may indicate the distance from the beginning of the file to the position of the payload data. For example, as indicated by the illustrative example 308, the offset may indicate the distance of the payload 310 from the beginning of the file 312. Thus, using the offset 314 (e.g., offset=n), only the payload 310 (e.g., payload n) that requires replicating is sent from storage device 235 (FIG. 3A) to the replication volume storage device, thereby replicating only that portion of the data that has changed. The replicated data may be sent over the network, for example, via the communication link 275 (FIG. 3A), to another client, CLIENT 2.
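The entry format of FIG. 3B and the offset-based update described above can be illustrated with a minimal Python sketch. The class name `LogEntry`, the function `apply_entry`, the field names, and the sample path are hypothetical and chosen only for illustration; the sketch also assumes the replica is held in memory as a byte buffer, which is not stated in the description.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    file_id: str       # stands in for identifier field 302 (hypothetical: a path)
    offset: int        # offset 304: distance from the beginning of the file
    payload: bytes     # payload region 306: only the bytes that changed
    timestamp: float   # timestamp 309


def apply_entry(replica: bytearray, entry: LogEntry) -> None:
    """Write the payload into the replica copy at the recorded offset."""
    end = entry.offset + len(entry.payload)
    if len(replica) < end:
        # Grow the replica when the write extends past its current end.
        replica.extend(b"\x00" * (end - len(replica)))
    replica[entry.offset:end] = entry.payload


# Only the changed region is shipped and applied, not the whole file.
replica = bytearray(b"hello world")
apply_entry(replica, LogEntry("docs/a.txt", 6, b"there", 0.0))
```

In this sketch the five-byte payload replaces the bytes starting at offset 6, and the remainder of the replica is left untouched.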

As indicated in FIG. 3A, at CLIENT 2, write data associated with the log 225 of CLIENT 1 may be received by the log 260 of CLIENT 2 via the communication link 275. The write data may then be received by the file system 265 of CLIENT 2 prior to being stored on the replication volume at the storage device (the replication volume).

Referring to FIG. 3A, changes captured by filter driver 220 on CLIENT 1 may later be used to replicate the write data entries utilizing the log 225, if, for example, a communication failure occurs between CLIENT 1 and CLIENT 2 due to a network problem associated with communication link 275. If the failure is of limited duration, the log 225 will not be overwritten by additional data being logged. Therefore, provided that during a network failure, the log 225 has enough storage capacity to store recent entries associated with the write data, the log 225 may be able to successfully send the recent write data entries to a replication volume upon restoration of communication.

The write data entries in the log 225 of CLIENT 1 may accumulate over time. Replication manager 210 of CLIENT 1 may periodically direct the write data entries of the log 225 to be sent to a storage device having the replication volume. During a network failure, however, the storage capacity of the log 225 may be exceeded as a result of recent logged entries associated with the write data. Upon such an occurrence, the log filter driver 220 may begin to overwrite the oldest entries associated with the write data. Replication of the write data associated with the overwritten entries may not be possible. Thus, the present embodiment allows for a full synchronization of data files between the storage device 235 and a replication volume which may be necessary to ensure the data volume in the storage device 235 associated with CLIENT 1 is replicated at the replication volume.
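The overflow behavior described above, in which the oldest entries are overwritten once the log is full, can be sketched as a fixed-capacity buffer. This is a minimal illustration only: the class `BoundedLog`, its methods, and the rule that any overwrite signals the need for a full synchronization are assumptions made for the example, not details taken from the description.

```python
from collections import deque


class BoundedLog:
    """Fixed-capacity log that overwrites its oldest entries when full."""

    def __init__(self, capacity: int):
        self._entries = deque(maxlen=capacity)
        self.overwritten = 0  # number of entries lost to overwriting

    def append(self, entry) -> None:
        if len(self._entries) == self._entries.maxlen:
            # The deque will silently drop its oldest entry on append.
            self.overwritten += 1
        self._entries.append(entry)

    def drain(self) -> list:
        """Return all pending entries (oldest first) and empty the log."""
        pending = list(self._entries)
        self._entries.clear()
        return pending

    def needs_full_sync(self) -> bool:
        # Assumption: once any entry is overwritten, the log alone can no
        # longer bring the replication volume up to date.
        return self.overwritten > 0


log = BoundedLog(capacity=3)
for write_number in range(5):   # five writes logged while the link is down
    log.append(write_number)
```

Here the first two writes are lost, so only the three most recent entries remain available for replication when communication is restored.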

In one embodiment, the storage manager 100 (FIG. 2) may monitor and control the network resources utilized in the replication operations. Through a defined storage policy, or interactive interfacing with a system administrator, the storage manager 100 may reallocate network resources (e.g., storage operation paths, storage devices utilized, etc.). Reallocating the resources of the network may alleviate the concentrated traffic and bottlenecks created by these types of situations in replication operations.

FIG. 4A illustrates a block diagram 280 of storage operation system components that may be utilized during synchronization operations on electronic data in a computer network in accordance with another embodiment of the present invention. System 280 is similar to system 200 (FIG. 3A) and uses like reference numbers to designate generally like components. As shown, system 280 may include CLIENT 1 and CLIENT 2 for, among other things, replicating data. CLIENT 1 may include a replication manager 210, a memory device 215, a log filter driver 220, one or more log files 225, a change journal filter 240, a change journal 241, a file system 230, and a storage device 235. Similarly, CLIENT 2 may include a replication manager 245, a memory device 250, one or more log files 260, 261, and a file system 265. The one or more log files 260, 261 may be utilized for different application types, such as SQL data, MICROSOFT EXCHANGE data, etc.

In one embodiment, the replication manager 210 may be included in the resynchronization agent 133 (FIG. 2). The replication manager 210, in one embodiment, may manage and coordinate the replication of data files between storage device 235 and a replication volume. As previously described in relation to FIG. 2, resynchronization agent 133 may be included in client computer 85. In such an embodiment, the replication manager 210 may reside within resynchronization agent 133, in a client computer. In other embodiments, replication manager 210 may be part of a computer operating system (OS). In such embodiments, for example, the client computer 85 (FIG. 2) may communicate and coordinate the data replication processes with the OS.

In the exemplary embodiment of FIG. 4A, the replication process between CLIENT 1 and CLIENT 2 in the system architecture 280 may occur, for example, during a data write operation in which storage data may be transferred from the memory device 215 of CLIENT 1 to a storage device 235 via the file system 230. The write data from the memory device 215, however, may be intercepted by the log filter driver 220. As previously described, the log filter driver 220 may, among other things, trap, filter or select intercepted application data received from memory 215. For example, ORACLE data, SQL data, or MICROSOFT EXCHANGE data may be selected by the log filter driver 220. Once the write data passes through and is captured by the log filter driver 220, the write data may be received by the change journal filter driver 240.

Change journal filter driver 240 may also create data records that reflect changes made to the data files (e.g., write activity associated with new file creation, existing file updates, file deletion, etc.) stored on the storage device 235. These data records, once selected by the change journal filter driver 240, may be stored as records in the change journal 241. The replication manager 210 may then utilize these change journal 241 record entries during replication operations if access to the log file 225 entries, which may have ordinarily facilitated the replication process as further described herein, is unavailable (e.g., corrupted, deleted, or overwritten entries). Write data may then be received at the file system 230 from the change journal filter driver 240, whereby the file system 230 may be responsible for managing the allocation of storage space and storage operations on the storage device 235, and copying/transferring data to the storage device 235.

In order to replicate the filtered write data that is received from the memory device 215, the log filter driver 220 may send write data filtered by the log filter driver 220 to the log 225. The log 225 may include metadata in addition to write data payloads, whereby the write data entries in the log 225 may include the data format 300, previously described and illustrated in relation to FIG. 3B.

As previously described in relation to the embodiments of FIGS. 3A and 3B, the present invention provides for replication operations during both normal and failure occurrences between CLIENT 1 and CLIENT 2 due to network problems (e.g., failure in communication link 275). In one embodiment, the filter driver 220 captures changes in the write data that may later be used to replicate write data entries utilizing the log 225, provided the failure is of limited duration and the log 225 does not get overwritten. Therefore, provided that during a network failure, the log 225 has enough storage capacity to store recent entries associated with the write data, the log filter driver 220 may be able to successfully send the recent write data entries to the replication volume upon restoration of communication.

The write data entries in the log 225 of CLIENT 1 may accumulate over time. The replication manager 210 of CLIENT 1 may periodically direct the write data entries of the log 225 to be sent to the replication volume. During a network failure, however, the storage capacity of the log 225 may be exceeded as a result of recent logged entries associated with the write data. Replication of write data associated with the overwritten entries may not be possible. Thus, under these conditions, the change journal 241 entries captured by the change journal filter driver 240 may enable the replication of write data without the need for a full synchronization of data files between the storage device 235 and a replication volume. As previously described, full synchronization may require a transfer of the entire storage volume stored at the storage device 235 linked to CLIENT 1 to the replication volume of CLIENT 2. The present embodiment is advantageous as a full synchronization operation may place a heavy burden on network resources, especially considering the large data volume that may reside on the storage device 235. In addition to the large data transfer requirement during this operation, other data transfer activities within the storage operation system may also create further network bottlenecks.

With the implementation of the change journal filter driver 240 and the change journal 241, the requirement for a full synchronization may be obviated. The changed data entries in change journal 241 may allow for the replication manager to selectively update the replicated data instead of requiring a full synchronization that may occupy valuable network resources better suited for other operations.

FIG. 4B illustrates some of the data fields 400 associated with entries within the change journal log 241 according to an embodiment of the invention. The data fields 400 may include, for example, a record identifier 402 such as an Update Sequence Number (USN), metadata 404, and a data object identifier 406 such as a File Reference Number (FRN). The data object identifier 406 may include additional information associated with the write data (e.g., file name, path, size, etc.). Each record logged or entered in change journal 241 via change journal filter driver 240 may have a unique record identifier number that may be located in the record identifier field 402. For example, this identifier may be a 64-bit identifier such as a USN number used in the MICROSOFT Windows® OS change journal system. Each of the records that are created and entered into the change journal 241 is assigned such a record identifier. In one embodiment, each of the assigned identifiers is sequentially incremented with the creation of a new record reflecting a change to the data of the client. For example, an assigned identifier (e.g., USN) associated with the most recent change to a file on the storage device may include the numerically greatest record identifier with respect to all previously created records, thereby indicating the most recent change. The metadata field 404 may include, among other things, a time stamp of the record, information associated with the sort of changes that have occurred to a file or directory (e.g., a Reason member), etc. In some embodiments, an FRN associated with the data object identifier 406 may include a 64-bit ID that uniquely identifies any file or directory on a storage volume such as that of the storage device 235 (FIG. 4A).
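The record fields of FIG. 4B and the sequential assignment of record identifiers can be sketched as follows. The names `ChangeRecord` and `ChangeJournal` and the use of consecutive integers starting at 1 are assumptions for illustration; the description only requires that each new record receive a numerically greater identifier than all earlier records.

```python
import itertools
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRecord:
    usn: int          # record identifier 402 (e.g., Update Sequence Number)
    frn: int          # data object identifier 406 (e.g., File Reference Number)
    reason: str       # part of metadata 404: the kind of change
    timestamp: float  # part of metadata 404: time stamp of the record


class ChangeJournal:
    """Hypothetical in-memory journal that assigns increasing identifiers."""

    def __init__(self):
        self._next_usn = itertools.count(1)  # assumption: consecutive integers
        self.records = []

    def log_change(self, frn: int, reason: str) -> ChangeRecord:
        record = ChangeRecord(next(self._next_usn), frn, reason, time.time())
        self.records.append(record)
        return record

    def latest_usn(self) -> int:
        # The greatest identifier marks the most recent change; 0 if empty.
        return self.records[-1].usn if self.records else 0


journal = ChangeJournal()
created = journal.log_change(frn=100, reason="create")
written = journal.log_change(frn=100, reason="write")
```

Because identifiers only increase, comparing two of them is enough to tell which change is more recent.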

In accordance with an embodiment of the invention, as further described herein, the record identifier fields 402 (FIG. 4B) of each logged record entered in change journal 241 may be utilized to resynchronize replication operations in conjunction with replication managers 210, 245 and one or more of the log files 260, 261. Based on the recorded entries in change journal 241, the replication manager 210 of CLIENT 1 may coordinate the transfer of files that are to be replicated with replication manager 245 of CLIENT 2. This may be accomplished as follows. Change journal 241 logs all changes and assigns a USN or FRN to each log entry in log 242 (FIG. 4A). Each log entry may include a timestamp indicating its recordation in log 242. Periodically, replication manager 210 may send the most recent USN copied to log 242 to the destination. Next, change journal 241 may be queried for changes since the last USN copied, which indicates the difference between the log at the source and the log at the destination, and only those log entries are replicated. This may be thought of as “resynchronizing” CLIENT 1 and CLIENT 2.
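The query for changes since the last USN copied can be sketched in a few lines. The function name `changes_since` and the representation of the journal as a list of (USN, FRN) pairs are hypothetical and used only to illustrate the selection step.

```python
def changes_since(journal, last_usn_copied):
    """Return the journal entries recorded after the last USN copied.

    `journal` is assumed to be a list of (usn, frn) pairs in ascending
    USN order; only entries newer than `last_usn_copied` are selected.
    """
    return [(usn, frn) for usn, frn in journal if usn > last_usn_copied]


# Source-side journal: four changes recorded, two already at the destination.
source_journal = [(1, 101), (2, 102), (3, 101), (4, 103)]
pending = changes_since(source_journal, last_usn_copied=2)
```

Only the entries in `pending` would need to be replicated to bring the destination in line with the source.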

Once the transfer of files has been coordinated by replication managers 210, 245, the designated files may be sent over communication link 275 to the one or more log files 260, 261. The files received are then forwarded from the one or more log files 260, 261 to the replication volume.

FIG. 5 is a flowchart 500 illustrating some of the steps involved in a replication process in a storage operation system under substantially normal operating conditions according to an embodiment of the invention. The replication process of FIG. 5 may be described with reference to system architecture 280 illustrated in FIG. 4A to facilitate comprehension. However, it will be understood this merely represents one possible embodiment of the invention and should not be construed to be limited to this exemplary architecture.

As shown, at step 502, it may be determined whether any write data (e.g., application specific data) is available for transfer to the storage device 235 of a first client, whereby the write data may require replication at the replication volume of a second client. If the write data (e.g., application data) requiring replication exists, it may be captured by the log filter driver 220 and logged in the log 225 (step 504). Additionally, through the use of another data volume filter driver, such as a MICROSOFT Change Journal filter driver, records identifying any changes to files or directories (e.g., change journal records) on the storage device 235 of the first client may be captured and stored in the change journal 241 (step 506).

In some embodiments, under the direction of the replication manager 210, the write data stored and maintained in the log 225 may be periodically (e.g., every 5 minutes) sent via a communications link 275 to the replication volume of the second client. In an alternative embodiment, under the direction of the replication manager 210, the write data stored in the log 225 may be sent via the communications link 275 to the replication volume when the quantity of data stored in the log 225 exceeds a given threshold. For example, when write data stored to the log 225 reaches a five megabyte (MB) capacity, all write data entries in the log 225 may be replicated to the second client.
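The two send policies described above, one driven by elapsed time and one by the quantity of logged data, can be expressed as simple predicates. The function and constant names are hypothetical, and treating one megabyte as 1024 × 1024 bytes is an assumption; the description gives only the example values of 5 minutes and 5 MB.

```python
SEND_INTERVAL_SECONDS = 5 * 60            # example value: every 5 minutes
SEND_THRESHOLD_BYTES = 5 * 1024 * 1024    # example value: 5 MB (binary MB assumed)


def due_by_interval(seconds_since_last_send: float,
                    interval: float = SEND_INTERVAL_SECONDS) -> bool:
    """Periodic policy: send once the interval has elapsed."""
    return seconds_since_last_send >= interval


def due_by_size(logged_bytes: int,
                threshold: int = SEND_THRESHOLD_BYTES) -> bool:
    """Threshold policy: send once the logged data reaches the threshold."""
    return logged_bytes >= threshold
```

A replication manager could evaluate either predicate each time an entry is logged; whether the two policies are ever combined is not stated in the description.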

Also, in some embodiments, under the direction of the replication manager 210, record identifiers (e.g., USN numbers) stored in the change journal 241 may also be periodically (e.g., every 5 minutes) sent via the communications link 275 to the replication manager 245 of the second client. The replication manager 245 may store these record identifiers in a log file at CLIENT 2, or at another memory index, or data structure (step 508). In other embodiments, under the direction of the replication manager 210, each record written to the change journal 241 may be directly sent via the communications link 275 to the replication manager 245.

At step 510, the record identifiers (e.g., USN numbers) sent via the communications link 275 and stored in the log file 260 may be compared with existing record identifiers. Based on a comparison between the greatest numerical value of a record identifier received at the log 260 and other record identifiers, replication data may be identified and replicated to the data volume of the second client.

FIG. 6 is a flowchart 600 illustrating some of the steps involved in a replication resynchronization process in a storage operation system according to an embodiment of the invention. The replication process of FIG. 6 may be described with reference to system architecture 280 illustrated in FIG. 4A to facilitate comprehension. However, it will be understood this merely represents one possible embodiment of the invention and should not be construed to be limited to this exemplary architecture.

At step 604, if a communication failure affecting replication, or other event criteria such as log file corruption, power failure, or loss of network, is detected and communication is then restored, the most recent record identifier field (e.g., USN number) in the destination log may be accessed and compared with the last record identifier received from the change journal log 241. The replication managers 210, 245 may coordinate and manage the comparison of these record identifier fields, which may include, in one embodiment, comparing identifier values such as USNs used in the MICROSOFT change journal (step 606).

As previously described, write operations or other activities (e.g., file deletions) associated with each file are logged in the change journal records having unique identification numbers (i.e., record identifiers) such as a USN number. At step 606, an identification number (e.g., USN number) associated with the last record identifier field stored at the change journal 241 may be compared with an identification number (e.g., USN number) associated with the most recent record identifier stored in the log 260 upon restoration of communication following the failure or other event. If it is determined that these identification numbers (e.g., USN numbers) are not the same (step 608), this may indicate that additional file activities (e.g., data write to file operations) may have occurred at the source location (i.e., CLIENT 1) during the failure. These changes may not have been replicated to the second client due to the failure. For example, this may be determined by the last record identifier field's USN number from the change journal 241 at the source having a larger numerical value than the USN number associated with the most recent record identifier field accessed from the log 260. In one embodiment, this may occur as a result of a log filter driver 220 not capturing an event (e.g., a data write operation) or overwriting an event. This may, therefore, lead to a record identifier such as a USN number not being sent to log file 260 associated with the replication data volume of the second client.

Since USN numbers are assigned sequentially, in an embodiment, the numerical comparison between the last record identifier field's USN number stored at the log 260 and the most recent record identifier field's USN number accessed from the change journal 241 may be used to identify any files that may not have been replicated at the replication volume (step 610) of the second client. For example, if the last record identifier field's USN number (i.e., at log 241) is “5” and the most recently sent record identifier field's USN number (i.e., at log 260) is “2,” it may be determined that the data objects associated with USN numbers “3, 4, and 5” have not yet been replicated to the second client. Once these data files have been identified (e.g., by data object identifiers such as FRNs in the change journal entries) (step 610), they may be copied from the storage device 235 of the first client and sent over the communication link 275 to the second client (step 612). Thus, the data volumes associated with storage device 235 and the replication volume may be brought back into sync without the need for resending (or re-copying) all the data files between the two storage devices.
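The worked example above (source at USN 5, destination at USN 2) can be reproduced with a short sketch. The function names and the dictionary mapping each USN to an FRN are hypothetical, and the sketch assumes consecutive integer USNs as in the example.

```python
def unreplicated_usns(source_last_usn: int, destination_last_usn: int) -> list:
    """USNs recorded at the source but not yet seen at the destination."""
    return list(range(destination_last_usn + 1, source_last_usn + 1))


def files_to_resend(journal: dict, source_last_usn: int,
                    destination_last_usn: int) -> list:
    """Map the missing USNs to the distinct files (by FRN) they refer to.

    `journal` is assumed to map each USN to the FRN of the changed file.
    """
    missing = unreplicated_usns(source_last_usn, destination_last_usn)
    return sorted({journal[usn] for usn in missing})


# Hypothetical journal contents: USN -> FRN of the file that changed.
journal = {1: 10, 2: 11, 3: 12, 4: 11, 5: 13}
missing = unreplicated_usns(5, 2)          # the "3, 4, and 5" of the example
resend = files_to_resend(journal, 5, 2)    # each affected file listed once
```

A file changed more than once during the outage appears only once in `resend`, since the current copy on the source storage device already reflects all of its changes.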

In the exemplary embodiments discussed above, a communication failure may generate an overflow in the log 225, which in turn may cause a loss of logged entries. As previously described, these lost entries inhibit the replication process upon restoration of communication. Other failures may also lead to a loss of logged entries in log 225. For example, these failures may include, but are not limited to, corrupted entries in log 225 and/or the inadvertent deletion or loss of entries in log 225.

FIG. 7 is a flowchart 700 illustrating a replication process in a storage operation system according to another embodiment of the invention. The replication process of FIG. 7 may also be described with reference to system architecture 280 illustrated in FIG. 4A to facilitate comprehension. However, it will be understood this merely represents one possible embodiment of the invention and should not be construed to be limited to this exemplary architecture.

The replication process 700 may, in one embodiment, be based on ensuring that electronic data files at a source storage device are synchronized with electronic data files at a destination or target storage device without the need to perform full synchronization operations over the storage operation network.

At step 702, the data files stored on a first storage device 235 and the record identifiers associated with the data records at the first storage device logged in change journal 241 may undergo a data transfer. Examples of certain data transfers include, but are not limited to, a block level copy, storage to a first destination storage medium/media such as magnetic media storage, tape media storage, optical media storage, or any other storage means having sufficient retention and storage capacity.

At step 704, the first destination medium/media, holding data from the first storage device, may be transferred (e.g., by vehicle) to a second destination storage device of the second client in FIG. 4A. At step 706, the data stored on the first destination medium/media may be loaded onto the second destination storage device.

Since copying the data from the first storage device 235 and journal log 241 onto the first destination medium/media and transporting the first destination medium/media to the second destination storage device (e.g., a storage device of the second client (not shown)), the data files at the first storage device 235 may have undergone changes during this transit period. For example, one or more existing data files may have been modified (e.g., a data write operation), deleted or augmented at the first storage device 235. In order to ensure that an up-to-date replication of the data files is copied to the destination storage device, particularly in light of such changes, a synchronization of data between the data files residing on both the first storage device 235 and the destination storage device may be required.

At step 708, record identifiers such as the USN numbers associated with each data record logged within the change journal 241 are compared with the record identifiers associated with data loaded onto the second destination storage device. This process may be performed, as during the time period between the first storage device 235 data files and the record identifiers being copied to the first destination medium/media and being transferred to the second destination storage device, the data files at the first storage device 235 may have undergone changes (e.g., modify, write, delete, etc.). Based on these changes to the data files at the first storage device 235, additional data record entries (e.g., the change journal entries) may have been created in change journal 241.

At step 710, the process determines whether data files at the first storage device 235 have changed compared to their copies stored at the destination storage device. As previously described (step 708), this is achieved by comparing the record identifiers (e.g., USN numbers) associated with each data record logged within the change journal 241 with the record identifiers associated with data loaded onto the second destination storage device. For example, if the USN numbers are the same, at step 712 it may be determined that no synchronization of data is required as the data has not changed. Thus, there is an indication that the data files at the first storage device 235 have not changed since being copied to the second destination storage device. However, for example, if at step 710 it is determined that the USN numbers associated with each data record logged within the change journal 241 are not the same as the USN numbers loaded onto the second destination storage device, the data files associated with the USN numbers that were not loaded onto the second destination storage device may be sent via a communication pathway from the first storage device 235 to the second destination storage device. Thus, the data files associated with the first storage device 235 (source location) are synchronized with the data files at the second destination storage device (target location).
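The comparison of steps 708 through 712 can be sketched as a set difference between the record identifiers in the source change journal and those loaded from the transported medium. The function name `files_to_send` and the (USN, FRN) pair representation are assumptions for illustration only.

```python
def files_to_send(journal_records, loaded_usns):
    """Return FRNs of files whose journal records were not on the medium.

    `journal_records` is assumed to be an iterable of (usn, frn) pairs from
    the source change journal; `loaded_usns` holds the USNs that arrived
    with the transported medium. An empty result corresponds to step 712:
    no synchronization is required.
    """
    loaded = set(loaded_usns)
    return sorted({frn for usn, frn in journal_records if usn not in loaded})


# Journal at the source after the medium was shipped with USNs 1 and 2.
journal_records = [(1, 500), (2, 501), (3, 500), (4, 502)]
unchanged = files_to_send(journal_records, [1, 2, 3, 4])
changed = files_to_send(journal_records, [1, 2])
```

Only the files named in `changed` would be sent over the communication pathway, while the bulk of the data has already arrived on the transported medium.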

Systems and modules described herein may comprise software, firmware, hardware, or any combination(s) of software, firmware, or hardware suitable for the purposes described herein. Software and other modules may reside on servers, workstations, personal computers, computerized tablets, PDAs, and other devices suitable for the purposes described herein. Software and other modules may be accessible via local memory, via a network, via a browser or other application in an ASP context or via other means suitable for the purposes described herein. Data structures described herein may comprise computer files, variables, programming arrays, programming structures, or any electronic information storage schemes or methods, or any combinations thereof, suitable for the purposes described herein. User interface elements described herein may comprise elements from graphical user interfaces, command line interfaces, and other interfaces suitable for the purposes described herein. Screenshots presented and described herein can be displayed differently as known in the art to input, access, change, manipulate, modify, alter, and work with information.

While the invention has been described and illustrated in connection with preferred embodiments, many variations and modifications as will be evident to those skilled in this art may be made without departing from the spirit and scope of the invention, and the invention is thus not to be limited to the precise details of methodology or construction set forth above as such variations and modifications are intended to be included within the scope of the invention.

1. A method of replicating data on an electronic storage system network comprising: storing a set of data on a first storage device, the set of data including a record identifier; copying the set of data to an intermediary storage device; transferring the set of data from the intermediary storage device to a second storage device; comparing the record identifiers of the set of data on the third storage device to the record identifier of the set of data on the first storage device; and updating the set of data on the third storage device upon detection of non-identical record identifiers, the updated data transmitted across the network.
 2. The method of claim 1 wherein the record identifiers include an update sequence number.
 3. The method of claim 1 wherein the updated data includes a highest update sequence number.
 4. The method of claim 1 wherein the record identifiers include a file reference number. 